A Proofs As mentioned in sections 3 and 4, our dataset D contains the random perturbation vectors ξ and side information

Neural Information Processing Systems

B.1 Loss function. Mathematically, the conditional total variation loss function (11) can be explicitly written as: L ... The joint loss minimization task is performed using a network architecture that trains two parallel networks simultaneously; the decoder is a mirrored version of the encoder. Given the time-series nature of the data, we follow a rolling-window approach for network training, as shown in Algorithm 2. Once the ... In this section, we discuss the data generation process for the simulated data used in Section 5.1. The data is generated following [Page Jr, 1984].
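
The rolling-window training mentioned in the fragment above can be sketched generically; this is not the referenced Algorithm 2, and the fixed window length and step size below are illustrative assumptions.

```python
def rolling_windows(n_obs, window, step=1):
    """Yield (train, test) index ranges that slide forward through a time
    series, so the model is always trained on a window of past observations
    and evaluated on the points immediately after it."""
    start = 0
    while start + window < n_obs:
        train = range(start, start + window)
        test = range(start + window, min(start + window + step, n_obs))
        yield train, test
        start += step

# Example: 10 observations, window of 5, stepping 2 at a time.
for train, test in rolling_windows(10, window=5, step=2):
    print(list(train), list(test))
```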


Teach Better or Show Smarter? On Instructions and Exemplars in Automatic Prompt Optimization

Neural Information Processing Systems

Large language models have demonstrated remarkable capabilities, but their performance is heavily reliant on effective prompt engineering. Automatic prompt optimization (APO) methods are designed to automate this and can be broadly categorized into those targeting instructions (instruction optimization, IO) vs. those targeting exemplars (exemplar optimization, EO). Despite their shared objective, the two have evolved rather independently, with IO receiving more research attention recently. This paper seeks to bridge this gap by comprehensively comparing the performance of representative IO and EO techniques, both in isolation and in combination, on a diverse set of challenging tasks. Our findings reveal that intelligently reusing model-generated input-output pairs, obtained from evaluating prompts on the validation set, as exemplars consistently improves performance on top of IO methods, yet this practice is currently under-investigated. We also find that, despite the recent focus on IO, how we select exemplars can outweigh how we optimize instructions: EO strategies as simple as random search, applied with unoptimized seed instructions, can outperform state-of-the-art IO methods. Moreover, we observe a synergy between EO and IO, with optimal combinations surpassing the individual contributions. We conclude that studying exemplar optimization, both as a standalone method and in its optimal combination with instruction optimization, remains a crucial aspect of APO and deserves greater consideration in future research, even in the era of highly capable instruction-following models.
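
To make the random-search EO baseline concrete, here is a minimal sketch (not the paper's implementation); `candidate_pairs` and `score_fn` are assumed stand-ins for the model-generated validation pairs and the validation metric.

```python
import random

def random_search_exemplars(candidate_pairs, score_fn, k=4, n_trials=20, seed=0):
    """Keep the best-scoring set of k exemplars found by random search.

    candidate_pairs: (input, output) pairs, e.g. model-generated answers
        collected while evaluating prompts on the validation set.
    score_fn: assumed callable scoring a prompt built from the exemplars
        (e.g. accuracy on a held-out split).
    """
    rng = random.Random(seed)
    best_exemplars, best_score = None, float("-inf")
    for _ in range(n_trials):
        exemplars = rng.sample(candidate_pairs, k)
        score = score_fn(exemplars)
        if score > best_score:
            best_exemplars, best_score = exemplars, score
    return best_exemplars, best_score
```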


Counting Still Counts: Understanding Neural Complex Query Answering Through Query Relaxation

Brunink, Yannick, Daza, Daniel, He, Yunjie, Cochez, Michael

arXiv.org Artificial Intelligence

Neural methods for Complex Query Answering (CQA) over knowledge graphs (KGs) are widely believed to learn patterns that generalize beyond explicit graph structure, allowing them to infer answers that are unreachable through symbolic query processing. In this work, we critically examine this assumption through a systematic analysis comparing neural CQA models with an alternative, training-free query relaxation strategy that retrieves possible answers by relaxing query constraints and counting resulting paths. Across multiple datasets and query structures, we find several cases where neural and relaxation-based approaches perform similarly, with no neural model consistently outperforming the latter. Moreover, a similarity analysis reveals that their retrieved answers exhibit little overlap, and that combining their outputs consistently improves performance. These results call for a re-evaluation of progress in neural query answering: despite their complexity, current models fail to subsume the reasoning patterns captured by query relaxation. Our findings highlight the importance of stronger non-neural baselines and suggest that future neural approaches could benefit from incorporating principles of query relaxation.
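
A minimal sketch of one plausible instantiation of the relaxation baseline for 2-hop queries follows; the paper's exact relaxation and counting scheme may differ, and the triple-list representation is an assumption.

```python
from collections import defaultdict

def relaxed_answers(triples, anchor, r1, r2):
    """Score candidates for the 2-hop query anchor --r1--> ?m --r2--> ?a by
    counting paths in which at most one relation constraint is relaxed.

    triples: iterable of (head, relation, tail).
    Returns {entity: path_count}; higher counts rank candidates higher.
    """
    out = defaultdict(list)                    # head -> [(relation, tail)]
    for h, r, t in triples:
        out[h].append((r, t))

    scores = defaultdict(int)
    for rel1, mid in out[anchor]:
        for rel2, ans in out[mid]:
            # A path counts if it violates at most one of the two
            # relation constraints, i.e. the query is mildly relaxed.
            if (rel1 != r1) + (rel2 != r2) <= 1:
                scores[ans] += 1
    return dict(scores)
```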


The Role of Active Learning in Modern Machine Learning

Werner, Thorben, Schmidt-Thieme, Lars, Yalavarthi, Vijaya Krishna

arXiv.org Artificial Intelligence

Even though Active Learning (AL) is widely studied, it is rarely applied in contexts outside its own scientific literature. We posit that the reason for this is AL's high computational cost coupled with the comparatively small lifts it typically generates in scenarios with few labeled points. In this work we study the impact of different methods of combating this low-data scenario, namely data augmentation (DA), semi-supervised learning (SSL), and AL. We find that AL is by far the least efficient method of solving the low-data problem, generating a lift of only 1-4% over random sampling, while DA and SSL methods can generate up to 60% lift in combination with random sampling. However, when AL is combined with strong DA and SSL techniques, it is, surprisingly, still able to provide improvements. Based on these results, we frame AL not as a method to combat missing labels, but as the final building block for squeezing the last bits of performance out of the data after appropriate DA and SSL methods have been applied.
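
For context on the random-sampling comparison, below is a generic pool-based active-learning loop, not the paper's code; `fit` and `predict_proba` are assumed wrappers around an arbitrary classifier.

```python
import numpy as np

def active_learning_loop(fit, predict_proba, X_pool, y_pool, labeled_idx,
                         strategy="uncertainty", rounds=10, batch=16, seed=0):
    """Spend the same labeling budget either uniformly at random or on the
    model's least-confident pool points, then return the final model."""
    rng = np.random.default_rng(seed)
    labeled = set(labeled_idx)
    for _ in range(rounds):
        idx = sorted(labeled)
        model = fit(X_pool[idx], y_pool[idx])
        unlabeled = np.array(sorted(set(range(len(X_pool))) - labeled))
        if strategy == "random":
            picked = rng.choice(unlabeled, size=batch, replace=False)
        else:  # uncertainty sampling: lowest maximum class probability
            confidence = predict_proba(model, X_pool[unlabeled]).max(axis=1)
            picked = unlabeled[np.argsort(confidence)[:batch]]
        labeled.update(int(i) for i in picked)
    idx = sorted(labeled)
    return fit(X_pool[idx], y_pool[idx])
```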


Reviews: Regret Bounds for Online Portfolio Selection with a Cardinality Constraint

Neural Information Processing Systems

Summary: The paper studies the online portfolio selection problem under cardinality constraints and provides two algorithms that achieve sublinear regret: one for the full-information setting and one for the bandit-feedback setting. The paper also provides lower bounds for both settings. Both algorithms take the same approach of splitting the problem into two learning problems: learning the optimal combination of assets, and learning the optimal portfolio over those assets. To learn the optimal combination of assets, a version of either the multiplicative weights algorithm (full information) or EXP3 (bandit feedback) is used.
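
A minimal sketch of the full-information variant as described, under the simplifying assumptions that each subset expert plays the uniform portfolio (the paper also learns the within-subset portfolio) and that all subsets are enumerated explicitly:

```python
import itertools
import math

def mw_subset_portfolio(price_relatives, n_assets, k, eta=0.1):
    """Run multiplicative weights with one expert per cardinality-k subset,
    each expert playing the uniform portfolio over its subset; return the
    final wealth and per-expert log-weights.

    price_relatives: T rows of n_assets positive price ratios.
    """
    experts = list(itertools.combinations(range(n_assets), k))
    log_w = [0.0] * len(experts)          # log-weights for numerical stability
    wealth = 1.0
    for x_t in price_relatives:
        # Form this round's portfolio as the weight-averaged expert mixture.
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        total = sum(w)
        portfolio = [0.0] * n_assets
        for wj, subset in zip(w, experts):
            for i in subset:
                portfolio[i] += (wj / total) / k
        wealth *= sum(p * x for p, x in zip(portfolio, x_t))
        # Multiplicative update: reward each expert by its log-return.
        for j, subset in enumerate(experts):
            log_w[j] += eta * math.log(sum(x_t[i] for i in subset) / k)
    return wealth, log_w
```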


ImProver: Agent-Based Automated Proof Optimization

Ahuja, Riyaz, Avigad, Jeremy, Tetali, Prasad, Welleck, Sean

arXiv.org Artificial Intelligence

Large language models (LLMs) have been used to generate formal proofs of mathematical theorems in proof assistants such as Lean. However, we often want to optimize a formal proof with respect to various criteria, depending on its downstream use. For example, we may want a proof to adhere to a certain style, or to be readable, concise, or modularly structured. Having suitably optimized proofs is also important for learning tasks, especially since human-written proofs may not be optimal for that purpose. To this end, we study a new problem of automated proof optimization: rewriting a proof so that it is correct and optimizes an arbitrary criterion, such as length or readability. As a first method for automated proof optimization, we present ImProver, a large-language-model agent that rewrites proofs to optimize arbitrary user-defined metrics in Lean. We find that naively applying LLMs to proof optimization falls short, so we incorporate various improvements into ImProver, such as the use of symbolic Lean context in a novel Chain-of-States technique, as well as error correction and retrieval. We test ImProver on rewriting real-world undergraduate, competition, and research-level mathematics theorems, finding that it is capable of rewriting proofs so that they are substantially shorter, more modular, and more readable.
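
The agent loop might look like the following hedged reconstruction from the abstract (not ImProver's code); `propose_rewrite`, `check_with_lean`, and `metric` are hypothetical hooks for the LLM call, the Lean checker, and the user-defined criterion.

```python
def optimize_proof(theorem, proof, metric, propose_rewrite, check_with_lean,
                   max_iters=8):
    """Iteratively ask the LLM for a rewrite, verify it, and keep only
    verified rewrites that improve the user-defined metric."""
    best_proof, best_score = proof, metric(proof)
    feedback = None                  # Lean errors fed back to the next attempt
    for _ in range(max_iters):
        candidate = propose_rewrite(theorem, best_proof, feedback)
        ok, feedback = check_with_lean(theorem, candidate)
        if ok and metric(candidate) > best_score:
            best_proof, best_score = candidate, metric(candidate)
            feedback = None
    return best_proof
```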


Model Cascading for Code: Reducing Inference Costs with Model Cascading for LLM Based Code Generation

Chen, Boyuan, Zhu, Mingzhi, Dolan-Gavitt, Brendan, Shafique, Muhammad, Garg, Siddharth

arXiv.org Artificial Intelligence

The rapid development of large language models (LLMs) has led to significant advancements in code completion tasks. While larger models have higher accuracy, they also cost much more to run. Meanwhile, model cascading has proven effective at conserving computational resources while enhancing accuracy in LLMs on natural language generation tasks: it generates output with the smallest model in a set, and only queries larger models when the output fails to meet predefined quality criteria. However, this strategy has not been used for code completion tasks, primarily because assessing the quality of code completions differs substantially from assessing natural language, as the former relies heavily on functional correctness. To address this, we propose letting each model generate and execute a set of test cases for its solutions, and using the test results as the cascading threshold. We show that our model cascading strategy reduces computational costs while increasing accuracy compared to generating the output with a single model. We also introduce a heuristic to determine the optimal combination of the number of solutions, test cases, and test lines each model should generate, given a budget. Compared to speculative decoding, our method works on black-box models and achieves the same level of cost-accuracy trade-off, while providing far more choices based on the server's budget. Ours is the first work to optimize the cost-accuracy trade-off for LLM code generation with model cascading.
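
One plausible reading of the test-gated cascade is sketched below (not the authors' implementation); the `gen_solutions`/`gen_tests` model interface and the `run_test` sandbox are assumptions.

```python
def cascade_generate(models, problem, run_test, n_solutions=5, n_tests=5,
                     threshold=0.8):
    """Walk the cascade from smallest to largest model; accept the first
    model whose best candidate passes enough of its own generated tests.

    models: ordered smallest -> largest, each with gen_solutions(problem, n)
        and gen_tests(problem, n) methods (assumed interface).
    run_test: sandboxed executor, (solution, test) -> bool (assumed).
    """
    best_solution = None
    for model in models:
        solutions = model.gen_solutions(problem, n_solutions)
        tests = model.gen_tests(problem, n_tests)

        def pass_rate(sol):
            return sum(run_test(sol, t) for t in tests) / max(len(tests), 1)

        best_rate, best_solution = max((pass_rate(s), s) for s in solutions)
        if best_rate >= threshold:       # quality gate met: stop escalating
            return best_solution
    return best_solution                 # fall back to the largest model's best
```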


A Unified Model for Spatio-Temporal Prediction Queries with Arbitrary Modifiable Areal Units

Chen, Liyue, Fang, Jiangyi, Liu, Tengfei, Cao, Shaosheng, Wang, Leye

arXiv.org Artificial Intelligence

Spatio-Temporal (ST) prediction is crucial for making informed decisions in urban location-based applications such as ride-sharing. However, existing ST models often require a region partition as a prerequisite, which leads to two main pitfalls. First, location-based services need ad-hoc regions for various purposes, requiring multiple ST models with varying scales and zones that can be costly to support. Second, different ST models may produce conflicting outputs, resulting in confusing predictions. In this paper, we propose One4All-ST, a framework that can conduct ST prediction for arbitrary modifiable areal units using only one model. To reduce the cost of obtaining multi-scale predictions, we design an ST network with hierarchical spatial modeling and scale-normalization modules that learn multi-scale representations efficiently and equally. To address prediction inconsistencies across scales, we propose a dynamic programming scheme that solves the formulated optimal-combination problem, minimizing the predicted error, and we support this scheme with theoretical analysis. In addition, we suggest using an extended quad-tree to index the optimal combinations for quick responses to arbitrary modifiable areal units in practical online scenarios. Extensive experiments on two real-world datasets verify the efficiency and effectiveness of One4All-ST for ST prediction over arbitrary modifiable areal units. The source code and data of this work are available at https://github.com/uctb/One4All-ST.
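
The optimal-combination step can be illustrated with a hedged sketch over a quad-tree of cells; the node layout and the additive-error assumption are illustrative rather than taken from the paper.

```python
def best_combination(cell):
    """Return (prediction, estimated_error) for a quad-tree cell, choosing
    between the model's direct prediction at this scale and the sum of the
    children's optimal predictions, whichever has lower estimated error.

    cell: {"pred": float, "err": float, "children": list of 4 cells or None}
    """
    if not cell["children"]:                      # leaf: finest scale available
        return cell["pred"], cell["err"]
    child = [best_combination(c) for c in cell["children"]]
    combined_pred = sum(p for p, _ in child)      # demand aggregates over sub-cells
    combined_err = sum(e for _, e in child)       # errors assumed additive
    if cell["err"] <= combined_err:
        return cell["pred"], cell["err"]          # keep the coarse prediction
    return combined_pred, combined_err            # use the children's combination
```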